Question Analysis Report

Generated: 2025-07-04T00:53:10.711409

Executive Summary

Dataset Size: 9,098 observations
Features: 501 total
Models Analyzed: 7 outcomes
Best R²: 0.151

Model Performance Summary

Outcome | Intercept | R² | Adj. R² | F-statistic | F p-value | AIC | BIC | RMSE | N Significant Features | High VIF Features | Mean VIF | Max VIF | Sample Size
news_proportion_left_leaning | 16.0563*** | 0.1253 | 0.1199 | 23.13 | 0.0000 | 88026.7 | 88432.3 | 30.4362 | 25 | 0 | 1.85 | 3.94 | 9,098
news_proportion_right_leaning | 1.5599* | 0.0743 | 0.0685 | 12.95 | 0.0000 | 67917.8 | 68323.4 | 10.0796 | 23 | 0 | 1.85 | 3.94 | 9,098
news_proportion_center_leaning | 82.1115*** | 0.1511 | 0.1459 | 28.75 | 0.0000 | 88595.2 | 89000.8 | 31.4021 | 22 | 0 | 1.85 | 3.94 | 9,098
news_proportion_unknown_leaning | 0.2724 | 0.0168 | 0.0108 | 2.77 | 0.0000 | 58245.5 | 58651.1 | 5.9236 | 9 | 0 | 1.85 | 3.94 | 9,098
news_proportion_high_quality | 73.0086*** | 0.1257 | 0.1203 | 23.21 | 0.0000 | 90279.3 | 90684.9 | 34.4473 | 29 | 0 | 1.85 | 3.94 | 9,098
news_proportion_low_quality | 3.7392*** | 0.0510 | 0.0451 | 8.67 | 0.0000 | 76356.2 | 76761.8 | 16.0267 | 16 | 0 | 1.85 | 3.94 | 9,098
news_proportion_unknown_quality | 23.2522*** | 0.1366 | 0.1313 | 25.55 | 0.0000 | 89037.3 | 89442.9 | 32.1744 | 28 | 0 | 1.85 | 3.94 | 9,098
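
The fit statistics in the table above are linked by standard OLS identities, which makes them easy to cross-check. The sketch below recomputes adjusted R², the F-statistic, and the BIC minus AIC gap for news_proportion_left_leaning from its R² of 0.1253 and sample size of 9,098. The predictor count p = 56 is an assumption: the report does not state it, and it is simply the value that appears to reproduce the reported figures to rounding. The function names are illustrative and not taken from the report's code.

```python
import math


def adjusted_r2(r2, n, p):
    """Adjusted R-squared for a model with p predictors plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2, n, p):
    """Overall F-statistic testing all p slope coefficients jointly."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


def bic_minus_aic(n, k):
    """BIC - AIC for k estimated parameters; the log-likelihood term cancels."""
    return k * (math.log(n) - 2.0)


n = 9098      # sample size from the table
p = 56        # ASSUMPTION: predictor count inferred from the reported values
r2 = 0.1253   # R-squared for news_proportion_left_leaning

print("Adj. R2:", round(adjusted_r2(r2, n, p), 4))
print("F-statistic:", round(f_statistic(r2, n, p), 2))
print("BIC - AIC:", round(bic_minus_aic(n, p + 1), 1))  # p slopes + intercept
```

Under that assumption the results should land close to the reported 0.1199, 23.13, and 405.6 (88432.3 minus 88026.7). A noticeably different p would not match all three at once.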

Correlation Matrix

Feature Importance

Regression Coefficients by Outcome

news_proportion_left_leaning (R² = 0.125, 39 features)

news_proportion_right_leaning (R² = 0.074, 39 features)

news_proportion_center_leaning (R² = 0.151, 39 features)

news_proportion_unknown_leaning (R² = 0.017, 39 features)

news_proportion_high_quality (R² = 0.126, 39 features)

news_proportion_low_quality (R² = 0.051, 39 features)

news_proportion_unknown_quality (R² = 0.137, 39 features)

Model Family Comparisons

proportion_left_leaning

proportion_right_leaning

proportion_high_quality

proportion_news

num_citations

Multicollinearity Diagnostics

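Interpretation: The Variance Inflation Factor (VIF) measures multicollinearity. For each feature, VIF = 1 / (1 - R²), where R² comes from regressing that feature on all the other features. A VIF of 1 means the feature shares no linear overlap with the others; values above roughly 5 to 10 are commonly treated as problematic. Every model below has a mean VIF of 1.85 and no features flagged as high-VIF.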

news_proportion_left_leaning (High VIF: 0, Mean VIF: 1.85)

news_proportion_right_leaning (High VIF: 0, Mean VIF: 1.85)

news_proportion_center_leaning (High VIF: 0, Mean VIF: 1.85)

news_proportion_unknown_leaning (High VIF: 0, Mean VIF: 1.85)

news_proportion_high_quality (High VIF: 0, Mean VIF: 1.85)

news_proportion_low_quality (High VIF: 0, Mean VIF: 1.85)

news_proportion_unknown_quality (High VIF: 0, Mean VIF: 1.85)
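
As an illustration of how VIF values like those above can be obtained, here is a minimal standard-library sketch. It relies on the identity that the VIF of feature j equals the j-th diagonal entry of the inverse of the feature correlation matrix. This is not necessarily how the report computed its values, the helper names are hypothetical, and the report does not state the cutoff it uses to flag a feature as "High VIF".

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length feature columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[x - m for x in c] for c, m in zip(columns, means)]
    norms = [sum(x * x for x in c) ** 0.5 for c in centered]
    k = len(columns)
    return [
        [
            sum(a * b for a, b in zip(centered[i], centered[j]))
            / (norms[i] * norms[j])
            for j in range(k)
        ]
        for i in range(k)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    k = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(k)]
        for i, row in enumerate(matrix)
    ]
    for col in range(k):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def vif(columns):
    """VIF per feature: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Toy data (not from the report): two features with correlation 0.8.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
print(vif([x1, x2]))
```

With only two features the result reduces to 1 / (1 - r²), which gives a quick sanity check on the implementation.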

Summary Statistics

Variable | Type | Mean | Std | Min | Max | N | Missing
num_citations | Citation Outcome | 5.7652 | 5.1669 | 0.0000 | 46.0000 | 32,400 | 0
proportion_high_quality | Citation Outcome | 8.9662 | 21.3873 | 0.0000 | 100.0000 | 32,400 | 0
proportion_left_leaning | Citation Outcome | 1.6659 | 7.4564 | 0.0000 | 100.0000 | 32,400 | 0
proportion_right_leaning | Citation Outcome | 0.0819 | 1.2404 | 0.0000 | 50.0000 | 32,400 | 0
news_proportion_high_quality | Citation Outcome | 21.8282 | 39.9889 | 0.0000 | 100.0000 | 32,400 | 0
news_proportion_left_leaning | Citation Outcome | 4.7865 | 18.8207 | 0.0000 | 100.0000 | 32,400 | 0
news_proportion_right_leaning | Citation Outcome | 0.3746 | 5.5664 | 0.0000 | 100.0000 | 32,400 | 0
proportion_news | Citation Outcome | 10.7818 | 23.2094 | 0.0000 | 100.0000 | 32,400 | 0
turn_number | Question/Response Feature | 1.7057 | 2.0636 | 1.0000 | 39.0000 | 32,400 | 0
total_turns | Question/Response Feature | 2.5335 | 3.5807 | 1.0000 | 50.0000 | 32,400 | 0
question_length_chars_log | Question/Response Feature | 0.0000 | 1.0000 | -3.8234 | 2.6358 | 32,400 | 0
question_length_words_log | Question/Response Feature | 0.0000 | 1.0000 | -2.2585 | 2.9189 | 32,400 | 0
response_length_log | Question/Response Feature | 0.0000 | 1.0000 | -7.0885 | 3.1220 | 32,400 | 0
response_word_count_log | Question/Response Feature | 0.0000 | 1.0000 | -5.6188 | 2.9660 | 32,400 | 0
model_family_google | Model Family | 7,563 observations | 23.3% | - | - | 32,400 | 0
model_family_openai | Model Family | 11,168 observations | 34.5% | - | - | 32,400 | 0
model_family_perplexity | Model Family | 13,669 observations | 42.2% | - | - | 32,400 | 0
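
The four *_log features above all show a mean of 0 and a standard deviation of 1, which suggests they were log-transformed and then standardized. The sketch below shows one way such a feature could be built. The use of log1p (rather than a plain log) and of the sample standard deviation are both assumptions, since the report does not describe its preprocessing, and the function name is hypothetical.

```python
import math
import statistics


def standardized_log(values):
    """Log-transform raw lengths, then z-score to mean 0 and std 1."""
    # ASSUMPTION: log1p keeps zero-length inputs finite; the report may use log.
    logs = [math.log1p(v) for v in values]
    mean = statistics.fmean(logs)
    # ASSUMPTION: sample standard deviation (n - 1 in the denominator).
    std = statistics.stdev(logs)
    return [(x - mean) / std for x in logs]


# Toy question lengths in characters (not from the report).
raw_lengths = [12, 48, 150, 600, 2400]
z_scores = standardized_log(raw_lengths)
print([round(z, 3) for z in z_scores])
```

After this transformation the regression coefficient on such a feature is read as the change in the outcome per one standard deviation of the logged length.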

Technical Details

Regression Method: OLS_statsmodels

PCA Precomputed: True

PCA Used: True

Total Features: 58